
issue/334 增加AutoInfinilmProcessor基建 (add AutoInfinilmProcessor infrastructure) #335

Merged
wooway777 merged 5 commits into main from issue/334
May 9, 2026

Conversation

@PanZezhong1725
Collaborator

No description provided.

@PanZezhong1725 PanZezhong1725 changed the title from "issue/334 add processor infra" to "issue/334 增加AutoInfinilmProcessor基建" (add AutoInfinilmProcessor infrastructure) Apr 29, 2026
@@ -0,0 +1,34 @@
class InfinilmProcessor:
Collaborator Author

@PanZezhong1725 PanZezhong1725 Apr 29, 2026


This file is the core change.

With the introduction of multimodal models, different models have different logic for processing input messages. The processing can be abstracted into three steps:

  1. apply chat template: returns text; note that this text cannot be encoded directly, the multimodal-aware process step must be called instead
  2. process: takes the templated prompt plus all images, videos, etc., and returns processed_input (containing PyTorch tensors, a constraint imposed by the HF functionality)
  3. batch: merges the processed_input of every request in the scheduler output into a batch of infinicore tensors (e.g. adding the inputs needed for continuous batching)

@PanZezhong1725 PanZezhong1725 marked this pull request as ready for review April 29, 2026 08:12
@PanZezhong1725 PanZezhong1725 requested review from a team, ma-hang and wooway777 April 29, 2026 08:12
@PanZezhong1725
Collaborator Author

九格 (Jiuge) 7B serving test results are correct (screenshots attached).

@PanZezhong1725 PanZezhong1725 reopened this May 9, 2026
@PanZezhong1725
Collaborator Author

test_infer.py: [screenshot]

test_benchmark.py: [screenshot]

bench.py: [screenshot]

Comment thread: examples/test_infer.py
enable_graph_compiling=enable_graph,
attention_backend=attn_backend,
kv_cache_dtype=cfg.kv_cache_dtype,
model = LLM(
Collaborator


After this change, does the offline-inference unit-test script also go through the serving flow's scheduling and cache management?

Collaborator


For offline unit tests, the path goes through scheduling, the cache queue, and block allocation.

After the PD-disaggregation service update, kv_connecter logic will be added to both the scheduling and cache-management parts, and there will also be kv_connecter code before and after forward.

For jiuge.py, isn't that too much code to pass through?

wooway777 added a commit that referenced this pull request May 9, 2026
issue/334 增加AutoInfinilmProcessor基建 (add AutoInfinilmProcessor infrastructure) #335
@wooway777 wooway777 merged commit 1c831b8 into main May 9, 2026
@wooway777 wooway777 deleted the issue/334 branch May 9, 2026 10:56